Consistency of Surrogate Risk Minimization Methods for Binary Classification using Strongly Proper Losses
Authors
Abstract
We have seen that, under certain conditions on the weights, a weighted-average plug-in classifier (or any learning algorithm that outputs such a classifier for the same training sample) is universally Bayes consistent w.r.t. the 0-1 loss. One might wonder for which other learning algorithms similar statements can be made. Can some of the other commonly studied or used learning algorithms be shown to be Bayes consistent w.r.t. the 0-1 loss? We have already seen results on Bayes consistency of the ERM algorithm w.r.t. the 0-1 loss, obtained at the expense of computational feasibility. At the other end of the spectrum, we have algorithms such as the SVM and logistic regression that are ubiquitous and computationally feasible but do not directly operate on the 0-1 loss. A natural desideratum in this situation is 'the best of both worlds': can we somehow use the minimization of 'surrogate' regret by commonly employed learning algorithms as a proxy for minimizing the 0-1 regret?
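As a concrete, purely illustrative instance of this recipe (not taken from the notes themselves): minimize a convex surrogate such as the logistic loss, a strongly proper composite loss, and then threshold the learned score to obtain a 0-1 classifier. The scikit-learn estimator and synthetic data below are assumptions made only for the sketch.

```python
# Minimal sketch: train by minimizing a convex surrogate (logistic loss),
# then threshold the score to get a 0-1 classifier.  scikit-learn and the
# synthetic dataset are illustrative assumptions, not part of the notes.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# Training minimizes the (regularized) logistic surrogate loss, not the 0-1 loss.
clf = LogisticRegression().fit(X_tr, y_tr)

# The surrogate minimizer is turned into a 0-1 classifier by thresholding the
# real-valued score f(x) at 0 (equivalently, the estimated probability at 1/2).
scores = clf.decision_function(X_te)
y_pred = (scores >= 0).astype(int)

zero_one_risk = np.mean(y_pred != y_te)
print(f"empirical 0-1 risk of the surrogate-trained classifier: {zero_one_risk:.3f}")
```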
Similar Resources
Consistency of Surrogate Risk Minimization Methods for Binary Classification using Classification Calibrated Losses
In the previous lecture, we saw that for a λ-strongly proper composite loss ψ, it is possible to bound the 0-1 regret in terms of the ψ-regret. Hence, for a λ-strongly proper composite loss ψ, if we have a ψ-consistent algorithm, we can use it to obtain a 0-1 consistent algorithm. However, not all loss functions used as surrogates in binary classification are proper, the hinge loss being one...
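For reference, the bound alluded to here usually takes the following form; this is a sketch of the standard statement for a λ-strongly proper composite loss ψ, and the exact constant should be checked against the previous lecture's notes.

```latex
% Sketch of the standard surrogate regret bound for a \lambda-strongly proper
% composite loss \psi (the constant \sqrt{2/\lambda} is the form usually quoted;
% verify against the lecture's own statement).  For any scoring function f:
\[
  \mathrm{regret}_{0\text{-}1}\!\left[\operatorname{sign}\circ f\right]
  \;\le\;
  \sqrt{\frac{2}{\lambda}}\,\sqrt{\mathrm{regret}_{\psi}[f]} ,
\]
% so driving the \psi-regret to zero drives the 0-1 regret to zero, which is
% the consistency transfer described above.
```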
Surrogate Regret Bounds for the Area Under the ROC Curve via Strongly Proper Losses
The area under the ROC curve (AUC) is a widely used performance measure in machine learning, and has been widely studied in recent years particularly in the context of bipartite ranking. A dominant theoretical and algorithmic framework for AUC optimization/bipartite ranking has been to reduce the problem to pairwise classification; in particular, it is well known that the AUC regret can be form...
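The pairwise view of the AUC mentioned in this abstract can be made concrete with a short sketch (illustrative only, not code from the cited paper): the AUC of a scoring function equals the fraction of positive-negative pairs it orders correctly, counting ties as one half.

```python
# Illustrative sketch of the pairwise definition of the AUC (not from the paper):
# AUC = P(score of a random positive > score of a random negative), ties count 1/2.
import numpy as np
from sklearn.metrics import roc_auc_score

def pairwise_auc(scores, labels):
    """AUC computed directly from its pairwise definition (labels in {0, 1})."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]          # compare every positive with every negative
    return np.mean(diff > 0) + 0.5 * np.mean(diff == 0)

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=200)
scores = labels + rng.normal(scale=1.0, size=200)   # noisy scores correlated with labels

print("pairwise AUC:", pairwise_auc(scores, labels))
print("roc_auc_score:", roc_auc_score(labels, scores))   # agrees with the pairwise count
```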
Classification Methods with Reject Option Based on Convex Risk Minimization
In this paper, we investigate the problem of binary classification with a reject option in which one can withhold the decision of classifying an observation at a cost lower than that of misclassification. Since the natural loss function is non-convex so that empirical risk minimization easily becomes infeasible, the paper proposes minimizing convex risks based on surrogate convex loss functions...
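For intuition about the target of such methods, the Bayes-optimal rule for the natural (non-convex) loss with rejection cost d < 1/2 is the classical Chow rule. The plug-in sketch below is illustrative only; it is not the convex-risk method proposed in the cited paper.

```python
# Illustrative sketch of the plug-in Chow rule for classification with a reject
# option (abstention cost d < 1/2).  Not the convex-risk method of the paper;
# it only shows the decision rule such methods aim to recover.
import numpy as np

def chow_rule(eta_hat, d):
    """Return +1 / -1 / 0 (reject) from estimated class-1 probabilities eta_hat."""
    decisions = np.zeros_like(eta_hat, dtype=int)   # 0 encodes "reject"
    decisions[eta_hat >= 1.0 - d] = 1               # confident positive
    decisions[eta_hat <= d] = -1                    # confident negative
    return decisions

eta_hat = np.array([0.02, 0.30, 0.55, 0.97])
print(chow_rule(eta_hat, d=0.1))   # -> [-1  0  0  1]
```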
Chapter 11 Surrogate Risk Consistency: the Classification Case
I. The setting: supervised prediction problem
(a) Have data coming in pairs (X, Y) and a loss L : R × Y → R (can have more general losses)
(b) Often, it is hard to minimize L (for example, if L is non-convex), so we use a surrogate φ
(c) We would like to compare the risks of functions f : X → R: Rφ(f) := E[φ(f(X), Y)] and R(f) := E[L(f(X), Y)]. In particular, when does minimizing the surrogate g...
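The two risks in item (c) can be computed empirically side by side; the sketch below uses the logistic loss as an example surrogate φ and labels in {-1, +1}, which are assumptions made for illustration (the chapter treats general surrogates).

```python
# Minimal sketch (logistic loss chosen as an example surrogate phi): empirical
# versions of the two risks R_phi(f) and R(f) for a fixed scoring function f.
import numpy as np

def surrogate_risk(f_values, y, phi=lambda t, y: np.log1p(np.exp(-y * t))):
    """Empirical R_phi(f) = E[phi(f(X), Y)], labels y in {-1, +1}."""
    return np.mean(phi(f_values, y))

def zero_one_risk(f_values, y):
    """Empirical R(f) = E[1{sign(f(X)) != Y}]."""
    return np.mean(np.sign(f_values) != y)

rng = np.random.default_rng(0)
y = rng.choice([-1, 1], size=1000)
f_values = 1.5 * y + rng.normal(size=1000)   # a reasonably good scoring function

print("empirical surrogate risk R_phi(f):", surrogate_risk(f_values, y))
print("empirical 0-1 risk       R(f):    ", zero_one_risk(f_values, y))
```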
Consistency of structured output learning with missing labels
In this paper we study statistical consistency of partial losses suitable for learning structured output predictors from examples containing missing labels. We provide sufficient conditions on data generating distribution which admit to prove that the expected risk of the structured predictor learned by minimizing the partial loss converges to the optimal Bayes risk defined by an associated com...
Journal:
Volume / Issue:
Pages: -
Publication date: 2013